ABSTRACT We have developed high-density DNA microarrays
of yeast ORFs. These microarrays can monitor
hybridization to ORFs for applications such as quantitative
differential gene expression analysis and screening for sequence
polymorphisms. Automated scripts retrieved sequence
information from public databases to locate predicted ORFs
and select appropriate primers for amplification. The primers
were used to amplify yeast ORFs in 96-well plates, and the
resulting products were arrayed using an automated microarraying
device. Arrays containing up to 2,479 yeast ORFs
were printed on a single slide. The hybridization of fluorescently
labeled samples to the array was detected and quantitated
with a laser confocal scanning microscope. Applications
of the microarrays are shown for genetic and gene
expression analysis at the whole genome level.



The genome sequencing projects have generated and will con- 
tinue to generate enormous amounts of sequence data. The 
genomes of Saccharomyces cerevisiae, Haemophilus influenzae (1),
Mycoplasma genitalium (2), and Methanococcus jannaschii (3)
have been completely sequenced. Other model organisms have 
had substantial portions of their genomes sequenced as well 
including the nematode Caenorhabditis elegans (4) and the small 
flowering plant Arabidopsis thaliana (5). Given this ever- 
increasing amount of sequence information, new strategies are 
necessary to efficiently pursue the next phase of the genome 
projects — the elucidation of gene expression patterns and gene 
product function on a whole genome scale. 

One important use of genome sequence data is to attempt 
to identify the functions of predicted ORFs within the genome. 
Many of the ORFs identified in the yeast genome sequence 
were not identified in decades of genetic studies and have no 
significant homology to previously identified sequences in the 
database. In addition, even in cases where ORFs have signif- 
icant homology to sequences in the database, or have known 
sequence motifs (e.g., protein kinase), this is not sufficient to 
determine the actual biological role of the gene product. 
Experimental analysis must be performed to thoroughly understand
the biological function of a given ORF's product.
Model organisms, such as S. cerevisiae, will be extremely 
important in improving our understanding of other more 
complex and less manipulable organisms. 

To examine in detail the functional role of individual ORFs and 
relationships between genes at the expression level, this work 
describes the use of genome sequence information to study large 
numbers of genes efficiently and systematically. The procedure 
was as follows. (i) Software scripts scanned annotated sequence
information from public databases for predicted ORFs. (ii) The
start and stop position of each identified ORF was extracted
automatically, along with the sequence data of the ORF and 200
bases flanking either side. (iii) These data were used to automatically
select PCR primers that would amplify the ORF. (iv) The
primer sequences were automatically input into the automated
multiplex oligonucleotide synthesizer (6). (v) The oligonucleotides
were synthesized in 96-well format, and (vi) used in 96-well
format to amplify the desired ORFs from a genomic DNA
template. (vii) The products were arrayed using a high-density
DNA arrayer (7-10). The gene arrays can be used for hybridization
with a variety of labeled products such as cDNA for gene
expression analysis, genomic DNA for strain comparisons, and
genomic mismatch scanning purified DNA for genotyping (11).

METHODS 

Script Design. All scripts were written in UNIX Tool Command 
Language. Annotated sequence information from GenBank was 
extracted into one file containing the complete nucleotide se- 
quence of a single chromosome. A second file contained the 
assigned ORF name followed by the start and stop positions of that 
ORF. The actual sequence contained within the specified range, 
along with 200 bases of sequence flanking both sides, was extracted 
and input into the primer selection program PRIMER 0.5 (Whitehead
Institute, Boston). Primers were designed so as to allow
amplification of entire ORFs. The selected primer sequences were 
read by the 96-well automated multiplex oligonucleotide synthe- 
sizer instrument for primer synthesis. The forward and reverse 
primers were synthesized in two separate 96-well plates in corresponding
wells. All primers were synthesized on a 20-nmol scale.
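
The extraction step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' Tool Command Language scripts: the function names, the whitespace-separated "NAME START STOP" file layout, and the 1-based inclusive coordinate convention are all assumptions made for the example.

```python
FLANK = 200  # bases of flanking sequence on each side, as stated in the text


def parse_orf_table(text):
    """Parse lines of the assumed form 'NAME START STOP' into tuples."""
    orfs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, start, stop = fields
        orfs.append((name, int(start), int(stop)))
    return orfs


def extract_with_flanks(chromosome, start, stop, flank=FLANK):
    """Return the ORF plus `flank` bases on each side (forward strand).

    `start` and `stop` are assumed to be 1-based inclusive coordinates.
    If start > stop (an ORF on the reverse strand), the two are swapped
    so that the window is always reported on the forward strand.
    """
    lo, hi = min(start, stop), max(start, stop)
    window_start = max(1, lo - flank)               # clamp at chromosome start
    window_end = min(len(chromosome), hi + flank)   # clamp at chromosome end
    return chromosome[window_start - 1:window_end]


if __name__ == "__main__":
    # Toy chromosome with a hypothetical 9-base ORF at positions 251-259.
    toy_chromosome = "A" * 250 + "ATGAAATAA" + "C" * 300
    for name, start, stop in parse_orf_table("TOY_ORF 251 259"):
        window = extract_with_flanks(toy_chromosome, start, stop)
        print(name, len(window))
```

The returned window is what would be handed to a primer selection program; clamping at the chromosome ends is a design choice for ORFs that lie within 200 bases of a terminus.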

ORF Amplification and Purification. Genomic DNA was isolated
as described (12) and used as template for the amplification
reactions. Each PCR was done in a total volume of 100 μl. A total
of 0.2 μM each of forward and reverse primers was aliquoted into
a 96-well PCR plate (Robbins Scientific, Sunnyvale, CA); a master
mix containing 0.24 mM each dNTP, 10 mM Tris (pH 8.5), 50 mM
MgCl2, 25 units Taq polymerase, and 10 ng of template was added
to the primers, and the entire mix was thermal cycled for 30 cycles
as follows: 15 min at 94°C, 15 min at 54°C, and 30 min at 72°C.
Products were ethanol precipitated in polystyrene v-bottom 96-well
plates (Costar). All samples were dried and stored at −20°C.

Arraying Procedure and Processing. Microarrays were 
made as described (8). 

A custom built arraying robot was used to print batches of 48 
slides. The robot utilizes four printing tips which simultaneously 
pick up ≈1 μl of solution from 96-well microtiter plates. After
printing, the microarrays were rehydrated for 30 sec in a humid
chamber and then snap dried for 2 sec on a hot plate (100°C). The
DNA was then UV crosslinked to the surface by subjecting the
slides to 60 millijoules of energy. The rest of the poly-L-lysine
surface was blocked by a 15-min incubation in a solution of 70 mM
succinic anhydride dissolved in a solution consisting of 315 ml of
1-methyl-2-pyrrolidinone (Aldrich) and 35 ml of 1 M boric acid
(pH 8.0). Directly after the blocking reaction, the bound DNA
was denatured by a 2-min incubation in distilled water at ≈95°C.



Abbreviation: YEP, yeast extract/peptone. 
Fig. 1. Two-color fluorescent scan of a yeast microarray containing
2,479 elements (ORFs). The center-to-center distance between
elements is 345 μm. A probe mixture consisting of cDNA from yeast
extract/peptone (YEP) galactose (green pseudocolor) and YEP glu- 
cose (red pseudocolor) grown yeast cultures was hybridized to the 
array. Intensity per element corresponds to ORF expression, and 
pseudocolor per element corresponds to relative ORF expression 
between the two cultures. 

The slides were then transferred into a bath of 100% ethanol at
room temperature.

Probe Preparation: cDNA. Yeast cultures (100 ml) were grown
to ≈1 OD600 and total RNA was isolated as described (13). Up
to 500 μg total RNA was used to isolate mRNA (Qiagen,
Chatsworth, CA). Oligo(dT)20 (5 μg) was added and annealed to
2 μg of mRNA by heating the reaction to 70°C for 10 min and
quick chilling on ice. To this mixture were added 2 μl
SuperScript II (200 units/μl) (Life
Technologies, Gaithersburg, MD), 0.6 μl 50× dNTP mix (final
concentrations were 500 μM dATP, dCTP, dGTP, and 200 μM
dTTP), 6 μl 5× reaction buffer, and 60 μM Cy3-dUTP or
Cy5-dUTP (Amersham). Reactions were carried out at 42°C for
2 h, after which the mRNA was degraded by the addition of 0.3
μl 5 M NaOH and 0.3 μl 100 mM EDTA and heating to 65°C for
10 min. The sample was then diluted to 500 μl with TE and
concentrated using a Microcon-30 (Amicon) to 10 μl.

Probe Preparation: Genomic DNA. Fluorescent DNA was
prepared from total genomic DNA as follows: 1 μg of random
nonamer oligonucleotides was added to 2.5 μg of genomic
DNA. This mixture was boiled for 2 min and then chilled on
ice. A reaction mixture containing dNTPs (25 μM dATP,
dCTP, dGTP, 10 μM dTTP, and 40 μM Cy3-dUTP or
Cy5-dUTP), reaction buffer (New England Biolabs), and 20
units exonuclease free Klenow enzyme (United States Biochemical)
was added, and the reaction was incubated at 37°C
for 2 h. The sample was then diluted to 500 μl with TE and
concentrated using a Microcon-30 (Amicon) to 10 μl.

Hybridization. Purified, labeled probe was resuspended in 11
μl of 3.5× SSC containing 10 μg Escherichia coli tRNA and 0.3%
SDS. The sample was then heated for 2 min in boiling water,
cooled rapidly to room temperature, and applied to the array. The
array was placed in a sealed, humidified hybridization chamber.
Hybridization was carried out for 10 h in a 62°C water bath, after
which the arrays were washed immediately in 2× SSC/0.2% SDS.
A second wash was performed in 0.1× SSC.

Analysis and Quantitation. Arrays were scanned on a 
scanning laser fluorescence microscope developed by Steve 
Smith with software written by Noam Ziv (Stanford University).
A separate scan was done for each of the two fluorophores
used. The images were then combined for analysis. A
bounding box, fitted to the size of the DNA spots, was placed 
over each array element. The average fluorescent intensity was 
calculated by summing the intensities of each pixel present in 
a bounding box and then dividing by the total number of pixels. 
Local area background was calculated for each array element 
by determining the average fluorescent intensity at the edge of 
the bounding box. To normalize for fluorophore-specific variation,
control spots containing yeast genomic DNA were
applied to each quadrant during the arraying process. These 
elements were quantitated and the ratios of the signals were 
determined. These ratios were then used to normalize the 
photomultiplier sensitivity settings such that the ratios of the 
fluorescence of the genomic DNA spots were close to a value 
of 1.0. The average signal intensity at any given spot was 
regarded as significant if it was at least two standard deviations 
above background. Each experiment was conducted in dupli- 
cate, with the fluorophores representing each channel re- 
versed. The ratios presented here are the average of the two 
experiments, except in the case in which the signal for the 
element in question was below the reliability threshold. The 
reliability threshold also determined the dynamic range of the 
experiment. For all of the experiments presented, the average 
dynamic range was ≈1 to 100. In the case where the fluorescence
from a very bright spot saturates the detector, differential
ratios will, in general, be underestimated. This can be
compensated for by scanning at a lower overall sensitivity. 
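
The quantitation rules in this section can be illustrated with a short Python sketch. It is not the scanning software used in the study. The image is represented as a plain list of pixel rows, the "edge of the bounding box" is taken to be its one-pixel-wide border, the standard deviation in the significance test is taken to be that of the border pixels, and the dye-swap average is an arithmetic mean; each of these is an assumption, since the text does not specify them.

```python
from statistics import mean, stdev

# Hypothetical 5 x 5 bounding box: a bright 3 x 3 spot on a dim border.
EXAMPLE_BOX = [
    [10, 12, 10, 12, 10],
    [12, 100, 100, 100, 12],
    [10, 100, 100, 100, 10],
    [12, 100, 100, 100, 12],
    [10, 12, 10, 12, 10],
]


def spot_signal(box):
    """Average intensity: sum of all pixels in the box / number of pixels."""
    pixels = [p for row in box for p in row]
    return sum(pixels) / len(pixels)


def edge_pixels(box):
    """Pixels on the one-pixel-wide border of the box (assumed 'edge')."""
    last = len(box) - 1
    edge = list(box[0])
    if last > 0:
        edge.extend(box[last])
    for row in box[1:last]:          # interior rows: first and last pixel only
        edge.append(row[0])
        if len(row) > 1:
            edge.append(row[-1])
    return edge


def is_significant(box, n_sd=2.0):
    """True if the spot average is at least n_sd SDs above local background."""
    edge = edge_pixels(box)
    return spot_signal(box) >= mean(edge) + n_sd * stdev(edge)


def dye_swap_ratio(a_cy3, b_cy5, a_cy5, b_cy3):
    """Average the A/B ratio over two hybridizations with the dyes reversed."""
    return (a_cy3 / b_cy5 + a_cy5 / b_cy3) / 2


if __name__ == "__main__":
    print(spot_signal(EXAMPLE_BOX), mean(edge_pixels(EXAMPLE_BOX)))
    print(is_significant(EXAMPLE_BOX))
```

Reversing the fluorophores between duplicates, as described above, means any residual dye-specific bias enters the two ratios in opposite directions before they are averaged.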

RESULTS 

The accumulation of sequence information from model organ- 
isms presents an enormous opportunity and challenge to under- 
stand the biological function of many previously uncharacterized 
genes. To do this accurately and efficiently, a directed strategy 
was developed that enables the monitoring of multiple genes 
simultaneously. Microarraying technology provides a method by 
which DNA can be attached to a glass surface in a high-density 
format (8). In practice, it is possible to array over 6,000 elements
in an area less than 1.8 cm². Given that the yeast genome consists
of ≈6,100 ORFs, the entire set of yeast genes can be spotted onto
a single glass slide.
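
As a rough check on these density figures, the following Python sketch computes the element spacing implied by a given element count and area, and the footprint of an array printed at a given spacing. It assumes elements lie on a square grid, which the text does not state; the function names are illustrative only.

```python
import math

UM2_PER_CM2 = 1e8  # 1 cm = 10^4 um, so 1 cm^2 = 10^8 um^2


def implied_pitch_um(n_elements, area_cm2):
    """Center-to-center spacing (um) for n elements on a square grid."""
    return math.sqrt(area_cm2 * UM2_PER_CM2 / n_elements)


def footprint_cm2(n_elements, pitch_um):
    """Area (cm^2) covered by n elements at the given square-grid spacing."""
    return n_elements * pitch_um ** 2 / UM2_PER_CM2


if __name__ == "__main__":
    # About 6,100 ORFs in 1.8 cm^2 implies a spacing of roughly 172 um.
    print(round(implied_pitch_um(6100, 1.8), 1))
    # 2,479 elements at the 345 um spacing reported for the arrays
    # used here cover roughly 2.95 cm^2.
    print(round(footprint_cm2(2479, 345), 2))
```

Under the square-grid assumption, fitting the complete ORF set into 1.8 cm² requires about half the 345 μm spacing reported for the arrays in this study.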

With this capability and the availability of the entire se- 
quence of the yeast genome, our strategy was to use a directed 
approach for generating the complete genome array. This 
procedure involved synthesizing a pair of oligonucleotide 
primers to amplify each ORF. The PCR product containing
each gene of interest was arrayed onto glass. The array can then
be used, for example, to monitor gene expression levels by
hybridizing to it labeled cDNA generated from isolated
mRNA of a culture grown under any experimental condition.

Primer Selection and Synthesis. The primer selection was fully 
automated using Tool Command Language scripts and PRIMER
0.5 (Whitehead). Primer pairs were automatically selected
successfully for >99% of the ORFs tested. Primer sequences can thus
be selected rapidly with minimal manual processing. A complete
set of forward and reverse primers was selected initially for each
ORF on chromosomes I, II, III, V, VI, VIII, IX, X, and XI.
Primers for a representative set of ORFs (15% coverage) were 
chosen for the remaining chromosomes. With the release of the 
entire yeast genome sequence, the complete set of primers has 
now been selected. 

Table 1. Heat shock vs. control expression data

Ratio of gene expression
Control  Heat   ORF      Gene      Description
         2.2    YLR142   PUT1      Proline oxidase
         2.0    YOL140   ARG8      Acetyl ornithine aminotransferase
2.3             YGL148   ARO2      Chorismate synthase
         36.0   YFL014   HSP12     Heat shock protein
         27.4   YBR072   HSP26     Heat shock protein
         6.7    YBR054   YRO2      Similarity to HSP30 heat shock protein Yro1p
         3.4    YCR021   HSP30     Heat shock protein
         2.6    YER103   SSA4      Heat shock protein
         2.5    YLR259   HSP60     Mitochondrial heat shock protein HSP60
         2.1    YBR169   SSE2      Heat shock protein of the HSP70 family
         1.7    YBL075   SSA3      Cytoplasmic heat shock protein
         1.4    YPL240   HSP82     Heat shock protein
         1.4    YDR258   HSP78     Mitochondrial heat shock protein of clpb family of ATP-dependent proteases
1.0             YNL007   SIS1      Heat shock protein
1.1             YEL030             70-kDa heat shock protein
1.9             YHR064             Heat shock protein
         1.3    YBL008   HIR1      Histone transcription regulator
2.6             YBL002   HTB2      Histone H2B.2
3.3             YBL003   HTA2      Histone H2A.2
3.3             YBR010   HHT1      Histone H3
3.9             YBR009   HHF1      Histone H4
         2.4    YDR343   HXT6      High-affinity hexose transporter
         2.1    YHR092   HXT4      Moderate- to low-affinity glucose transporter
3.6             YAR071   PHO11     Secreted acid phosphatase, 56 kDa isozyme
         2.3    YLR096   KIN2      Ser/Thr protein kinase
2.5             YER102   RPS8B     Ribosomal protein S8.e
2.6             YBR181   RPS101    Ribosomal protein S6.e
2.6             YCR031   CRY1      40S ribosomal protein S14.e
2.7             YLR441   RP10A     Ribosomal protein S3a.e
2.8             YHR141   RPL41B    Ribosomal protein L36a.e
2.8             YBL072   RPS8A     Ribosomal protein S8.e
2.8             YHL015   URP2      Ribosomal protein
2.8             YBR191   URP1      Ribosomal protein L21.e
3.1             YLR340   RPLA0     Acidic ribosomal protein L10.e
3.3             YGL123   SUP44     Ribosomal protein
         5.8    YLR194             Hypothetical protein

Because each ORF requires a unique pair of synthetic primers,
a total of approximately 12,200 oligonucleotides will be required
to individually amplify each target. This costly component was
addressed with the automated multiplex oligonucleotide synthesizer
(6), which efficiently synthesizes primers in a 96-well format.
Each primer, synthesized on a 20-nmol scale, provides enough
material for 100 amplification reactions, whereas a given PCR
product provides enough material to generate an element on
500-1,000 arrays. Thus, a single primer pair provides enough
starting material for up to ≈50,000 arrays.
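
The primer and array counts quoted in this paragraph follow from simple multiplication, sketched below in Python. The constants are taken from the text; the plate count is an added illustration that assumes every well of every 96-well plate is used, and the array estimate uses the lower figure of 500 arrays per PCR product.

```python
import math

ORF_COUNT = 6100             # approximate number of yeast ORFs
PRIMERS_PER_ORF = 2          # one forward and one reverse primer
WELLS_PER_PLATE = 96
REACTIONS_PER_PRIMER = 100   # PCRs supported by one 20-nmol synthesis
ARRAYS_PER_PRODUCT = 500     # lower end of the 500-1,000 range


def total_primers(orfs=ORF_COUNT):
    """Oligonucleotides needed to amplify every ORF individually."""
    return orfs * PRIMERS_PER_ORF


def plates_per_direction(orfs=ORF_COUNT):
    """96-well plates needed for the forward (or the reverse) primers."""
    return math.ceil(orfs / WELLS_PER_PLATE)


def arrays_per_primer_pair(reactions=REACTIONS_PER_PRIMER,
                           arrays=ARRAYS_PER_PRODUCT):
    """Arrays obtainable from one primer pair before resynthesis."""
    return reactions * arrays


if __name__ == "__main__":
    print(total_primers(), plates_per_direction(), arrays_per_primer_pair())
```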

Primers were synthesized to amplify yeast ORFs. Primer 
synthesis had a failure rate of <1% in over 18 plates of 
synthesis as determined by standard trityl analysis (6). The 
success rate of the PCR amplifications using the primer pairs 
was 94% based on agarose gel analysis of each PCR. The 
purified PCR products were used to generate arrays. Two 
versions of the arrays were created for the experimental results 
presented here. The first array contained 2,287 elements and 
the second array batch contained 2,479 elements. 

Genome Arrays. The amplified ORFs were arrayed onto glass 
at a spacing of 345 microns (Fig. 1). The high-density spacing of 
DNA samples allows the hybridization volumes to be minimized;
volumes are a maximum of 10 μl. The labeled probe can
thus be maintained at relatively high concentrations, making 1-2
μg of mRNA sufficient for analysis. This also obviates the need
for a subsequent amplification step and thus avoids the risk of 
altering the relative ratios of different cDNA species in the 
sample. 

Genetic Analysis: Genomic Comparison of Unrelated Strains. 

Microarrays allow efficient comparison of the genomes of different
strains. Genomic DNA from Y55, an S. cerevisiae strain
divergent from the reference strain S288c, was randomly labeled
with Cy3-dUTP and hybridized simultaneously with the S288c
DNA labeled with Cy5-dUTP. When the hybridization of the
DNA from the two strains was compared, several
elements gave relatively little or no signal above background from
the Cy3 channel (data not shown). These include SGE1,
ASP3A-D, YLR156, YLR159, YLR161, ENA2 (YDR039 is
ENA2), and YCR105. These results imply that the regions
containing these genes are extremely divergent, or altogether
deleted from the strain. Subsequent attempts to generate PCR
products from SGE1, ENA2, and ASP3A using Y55 DNA failed. 
This result supports the conclusion that these genes are likely to 
be missing from the Y55 genome. It is interesting to note that at 
least two of the regions absent in the Y55 genome have been 
previously shown or suggested to be deleted in mutant laboratory 
strains (14-16). In particular, the Asp-3 region appears to be 
highly prone to being deleted (15, 16). 

These results indicate that gene arrays can be used to efficiently 
screen different strains of an organism for large deletion poly- 
morphisms. A single hybridization and scan will reveal differences 
based on differential hybridization to particular elements. It is 
reasonable to suppose that an equivalent number of genes are 
present in the Y55 genome and absent in the S288c genome. This 
result should be viewed as a minimum estimate of the deletion 
polymorphisms that exist between these two unrelated strains as 
intergenic deletions or small intragenic deletions would not be 
detected because considerable hybridizing material would
remain. Sequence polymorphisms, such as deletions, are present
in populations of every species and must at some level affect 
phenotype. One of the challenges of the genome era will be to 
critically examine sequence polymorphisms that exist in the 
natural gene pool relative to the reference genome sequence. 
Fig. 2. ORF categories displaying differential expression between
heat shocked and untreated cultures. Bars within categories
correspond to individual ORFs. Green shaded bars correspond to
relative increases in ORF expression under 25°C growth conditions.
Red shaded bars correspond to relative increases in ORF expression
under 39°C growth conditions.



Gene Expression Analysis. The arrays were used to examine
gene expression in yeast grown under a variety of different
conditions. Expression analysis is an ideal application of these
arrays because a single hybridization provides quantitative
expression data for thousands of genes. To better understand results for
genes of known function, ORFs were placed in biologically relevant
categories on the basis of function (e.g., amino acid catabolic
genes) and/or pathways (e.g., the histidine biosynthesis pathway).


Table 2. Cold shock vs. control expression data

Ratio of gene expression
Control  Cold   ORF      Gene      Description
         3.3    YOR153   PDR5      Pleiotropic drug resistance protein
2.4             YCR012   PGK1      Phosphoglycerate kinase
2.9             YCL040   GLK1      Aldohexose specific glucokinase
         1.4    YHR064             Heat shock protein
2.0             YJL034   KAR2      Nuclear fusion protein
2.1             YDR258   HSP78     Mitochondrial heat shock protein of clpb family of ATP-dependent proteases
2.2             YLL039   UBI4      Ubiquitin precursor
2.7             YLL026   HSP104    Heat shock protein
3.1             YER103   SSA4      Heat shock protein
3.3             YBR126   TPS1      α,α-Trehalose-phosphate synthase (UDP-forming)
3.8             YPL240   HSP82     Heat shock protein
7.9             YBR054   YRO2      Similarity to HSP30 heat shock protein Yro1p
7.9             YBR072   HSP26     Heat shock protein
16.5            YCR021   HSP30     Heat shock protein
1.8             YDR343   HXT6      High-affinity hexose transporter
2.1             YHR096   HXT5      Putative hexose transporter
2.4             YFR053   HXK1      Hexokinase I
2.8             YHR092   HXT4      Moderate- to low-affinity glucose transporter
3.4             YHR094   HXT1      Low-affinity hexose (glucose) transporter
         2.3    YHR089   GAR1      Nucleolar rRNA processing protein
         1.7    YLR048   NAB1B     40S ribosomal protein p40 homolog b
         1.7    YLR441   RP10A     Ribosomal protein S3a.e
         1.7    YLL045   RPL4B     Ribosomal protein L7a.e.B
         1.6    YLR029   RPL13A    Ribosomal protein L15.e
         1.6    YGL123   SUP44     Ribosomal protein
         3.1    YBR067   TIP1      Cold- and heat-shock-induced protein of the Srp1/Tip1p family
         2.2    YER011   TIR1      Cold-shock-induced protein of the Tir1p, Tip1p family
         2.0    YCR058             Hypothetical protein
         4.2    YKL102             Hypothetical protein

Table 3. Glucose vs. galactose expression data

Ratio of gene expression
Glucose  Galactose  ORF      Gene      Description
2.1                 YHR018   ARG4      Argininosuccinate lyase
3.5                 YPR035   GLN1      Glutamate-ammonia ligase
2.8                 YML116   ATR1      Aminotriazole and 4-nitroquinoline resistance protein
2.0                 YMR303   ADH2      Alcohol dehydrogenase II
3.7                 YBR145   ADH5      Alcohol dehydrogenase V
         3.2        YBL030   AAC2      ADP, ATP carrier protein 2
         2.9        YBR085   AAC3      ADP, ATP carrier protein
         2.7        YDR298   ATP5      H+-transporting ATP synthase δ chain precursor
         2.5        YBR039   ATP3      H+-transporting ATP synthase γ chain precursor
         5.5        YML054   CYB2      Lactate dehydrogenase cytochrome b2
         3.4        YML054   CYB2      Lactate dehydrogenase cytochrome b2
         2.3        YKL150   MCR1      Cytochrome-b5 reductase
         4.2        YBL045   COR1      Ubiquinol-cytochrome c reductase 44K core protein
         3.5        YDL067   COX9      Cytochrome c oxidase chain VII A
         2.7        YLR038   COX12     Cytochrome c oxidase, subunit VI B
         2.6        YHR051   COX6      Cytochrome c oxidase subunit VI
         2.4        YLR395   COX8      Cytochrome c oxidase chain VIII
         2.3        YFR033   QCR6      Ubiquinol-cytochrome c reductase 17K protein
         23.7       YLR081   GAL2      Galactose (and glucose) permease
         21.9       YBR018   GAL7      UDP-glucose-hexose-1-phosphate uridylyltransferase
         21.8       YBR020   GAL1      Galactokinase
         19.5       YBR019   GAL10     UDP-glucose 4-epimerase
         14.7       YLR081   GAL2      Galactose (and glucose) permease
         8.6        YDR009   GAL3      Galactokinase
         3.0        YML051   GAL80(1)  Negative regulator for expression of galactose-induced genes
         2.8        YML051   GAL80(2)  Negative regulator for expression of galactose-induced genes
2.7                 YER055   HIS1      ATP phosphoribosyltransferase
3.4                 YBR248   HIS7      Glutamine amidotransferase/cyclase
7.4                 YCL030   HIS4      Phosphoribosyl-AMP cyclohydrolase/phosphoribosyl-ATP pyrophosphatase/histidinol dehydrogenase
5.8                 YKR080   MTD1      Methylenetetrahydrofolate dehydrogenase (NAD+)
6.0                 YDR019   GCV1      Glycine decarboxylase T subunit
6.1                 YLR058   SHM2      Serine hydroxymethyltransferase
         8.1        YML123   PHO84     High-affinity inorganic phosphate/H+ symporter
3.5                 YDR408   ADE8      Phosphoribosylglycinamide formyltransferase (GART)
3.6                 YDR408   ADE8      Phosphoribosylglycinamide formyltransferase (GART)
4.4                 YAR015   ADE1      Phosphoribosylamidoimidazole-succinocarboxamide synthase
5.6                 YMR300   ADE4      Amidophosphoribosyltransferase
5.6                 YOR128   ADE2      Phosphoribosylaminoimidazole carboxylase
6.0                 YGL234   ADE5,7    Phosphoribosylamine-glycine ligase and phosphoribosylformylglycinamidine cyclo-ligase
         6.3        YBL015   ACH1      Acetyl-CoA hydrolase



Heat Shock Results. A log phase culture growing in YEP/ 
dextrose medium at 25°C was split in half. One half of the 
culture remained at 25°C whereas the other half of the culture 
was shifted to 39°C. mRNA was isolated from both cultures 1 h 
after heat shock for comparison on microarrays and, although 
this time point is not optimal for measuring induction of heat 
shock mRNAs (17), many known heat shock genes exhibited 
considerable induction at this time point (Table 1; Fig. 2). 
Down-regulation of genes in the ribosomal protein and histone 
gene categories was also observed. Differential expression 
between the heat-shocked culture and the control was also 
observed for many other genes. Genes in many categories, such 
as amino acid catabolism and amino acid synthesis, exhibited 
a mixed response with some genes showing little or no 
differential expression and other genes showing a significant 
increase or decrease in gene expression in response to heat 
shock (Table 1; Fig. 2). 

Cold Shock Results. A log phase culture growing in YEP/ 
dextrose medium at 37°C was split in half. One half of the 
culture remained at 37°C while the other half of the culture was 
shifted to 18°C. mRNA was isolated from both cultures 1 h 
after cold shock for comparison on microarrays. As expected,
two known cold shock genes (TIP1, TIR1) were expressed at
a significantly higher level in the cold-shocked culture. Genes
in other functional categories, such as glucose metabolism and
heat shock, displayed a mixed response with expression of some
genes being unaffected and other genes exhibiting significant
up- or down-regulation in response to cold shock (Table 2).

Steady-State Galactose vs. Glucose Results. mRNA was 
isolated from steady-state log phase YEP galactose and YEP 
glucose grown cultures for comparison on the microarrays. As 
expected, the GAL genes were expressed at a much higher 
level in the galactose culture. Many genes were differentially 
expressed in these cultures that were not a priori expected to 
exhibit differential expression. For example, some genes in the 
amino acid catabolic category were up-regulated in the galac- 
tose culture whereas genes in the one-carbon metabolism and 
purine categories were largely or entirely down-regulated in 
the galactose culture (Table 3). Genes in other categories, such 
as amino acid synthesis, ABC transporter, cytochrome c, and
cytochrome b, exhibited mixed responses; some genes in a 
category showed little or no obvious differential expression 
whereas other genes in the same category showed significant 
differential expression in the galactose and glucose cultures. 



DISCUSSION

The results of these experiments show that many genes are 
differentially expressed under the three environmental condi- 
tions described here. The expected and predicted changes in gene 
expression, such as HSP12 in the heat-shocked culture, TIP1 in 
the cold-shocked culture, and GAL2 in the steady-state galactose 
culture, were observed in every case. However, in addition to the 
expected changes in gene expression, significant differential 
expression was also observed for many other genes that would 
not, a priori, be expected to be differentially expressed. For 
example, expression of PHO11 decreased and expression of
YLR194, KIN2, and HXT6 increased in the heat shocked culture.
Expression of MST1 and APE3 decreased and expression of
PDR5 and GAR1 increased in the cold-shocked culture. In
addition, ADE4 and SER2 were expressed at reduced levels
whereas PHO84 and ACH1 were expressed at higher levels in
cells grown in galactose compared with cells grown in glucose. 
Differential expression of these and many other genes was specific 
to one of these three environmental conditions. 

Many other genes were found to be differentially expressed 
under more than one condition. When differentially expressed 
genes in cold- and heat-shocked cultures were compared, 30 
genes were found in common. Of these 30 genes, 28 showed 
inverse expression (i.e., increased expression under one condition 
and decreased expression under the other condition). Two genes, 
YCR058 and YKL102, showed elevated expression in response to 
both cold and heat shock. Fifteen genes were found to be 
differentially expressed in both the heat-shocked and steady-state 
galactose cultures: 9 genes showed increased expression and 5 
showed decreased expression under both conditions. Twenty 
genes were differentially expressed in both the cold-shocked and 
steady-state galactose cultures: 8 genes showed decreased expres- 
sion and 5 genes showed increased expression under both con- 
ditions. Six genes showed increased expression in the galactose 
culture and decreased expression in the cold shocked culture. 
One gene (ODP1) showed increased expression in both the 
cold-shocked and steady-state galactose cultures. 
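
The kind of cross-condition comparison described in this paragraph can be sketched with set operations in Python. The fold changes below are a small subset taken from Tables 1 and 2; encoding direction as a sign (positive for higher expression in the treated culture, negative for higher expression in the control) is a convention adopted only for this example, and the function name is hypothetical.

```python
# Signed fold changes for a few ORFs from Tables 1 and 2.
# Positive: higher in the shocked culture. Negative: higher in the control.
HEAT = {"YFL014": 36.0, "YBR072": 27.4, "YCR021": 3.4,
        "YLR441": -2.7, "YGL123": -3.3}
COLD = {"YBR072": -7.9, "YCR021": -16.5, "YLR441": 1.7,
        "YGL123": 1.6, "YBR067": 3.1}


def compare_conditions(a, b):
    """Return (shared, same_direction, inverse) ORF lists for two conditions."""
    shared = sorted(set(a) & set(b))
    same = [orf for orf in shared if (a[orf] > 0) == (b[orf] > 0)]
    inverse = [orf for orf in shared if (a[orf] > 0) != (b[orf] > 0)]
    return shared, same, inverse


if __name__ == "__main__":
    shared, same, inverse = compare_conditions(HEAT, COLD)
    print("shared:", shared)
    print("same direction:", same)
    print("inverse:", inverse)
```

For this subset, every ORF present in both tables responds in opposite directions to heat and cold shock, in line with the predominance of inverse expression reported above.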

Gene expression is affected in a global fashion when environ- 
mental conditions are changed and both expected and unex- 
pected genes are affected. There is also overlap in the genes that 
are differentially expressed under quite different environmental 
conditions. These results can be rationalized by considering the 
high degree of cross-pathway regulation in yeast. For example, 
there is evidence for cross-pathway regulation between (i) carbon
and nitrogen metabolism (18), (ii) phosphate and sulfate metabolism
(19), and (iii) purine, phosphate, and amino acid metabolism
(20-24). There are also examples of the interaction of
general and specific transcription factors (25, 26). Finally, within
the broad class of amino acid biosynthetic genes, there is evidence
for amino acid specific regulation of some genes, regulation via
general control for other genes, and regulation via both specific
and general control for other genes (22, 27-30).

Cross-pathway regulation arises from the complex structure 
of promoters. Virtually all promoters contain sites for multiple 
transcription factors and, therefore, virtually all genes are 
subject to combinatorial regulation. For example, the HIS4 
promoter contains binding sites for GCN4 (the general amino 
acid control transcription factor), PHO2/BAS2 (a transcriptional
regulator of phosphatase and purine biosynthetic
genes), and BAS1 (a transcriptional regulator of purine bio- 
synthetic genes) (31). It is likely that the complex effects on 
gene expression described in this work are a direct conse- 
quence of the combinatorial regulation of gene expression. 

These findings illustrate the power of the highly parallel whole 
genome approach when examining gene expression. The global 
effects of environmental change on gene expression can now be 
directly visualized. It is clear that determining the mechanism(s) 
and the functional role of the dramatic global effects on gene 
expression in different environments will be a significant challenge.
The era of whole genome analysis will, ultimately, allow
researchers to switch from the very focused single gene/promoter 
view of gene expression and instead view the cell more as a large 
complex network of gene regulatory pathways. 

With the entire sequence of this model organism known, new 
approaches have been developed that allow for genome wide 
analyses (32, 33) of gene function. The genome microarrays 
represent a novel tool for genetic and expression analysis of the 
yeast genome. This pilot study uses arrays containing >35% of 
the yeast ORFs and it is clear that the entire set of ORFs from 
the yeast genome can be arrayed using the directed primer based 
strategy detailed here. Recent advances in arraying technology 
will allow all 6,100 ORFs to be arrayed in an area of less than 1.8
cm². Furthermore, as the technology improves, detection limits
will allow less than 500 ng of starting mRNA material to be used 
for making probe. 

The genome arrays provide for a robust, fully automated 
approach toward examining genome structure and gene func- 
tion. They allow for comparisons between different genomes 
as well as a detailed study of gene expression at the global level. 
This research will help to elucidate relationships between 
genes and allow the researcher to understand gene function by 
understanding expression patterns across the yeast genome. 

Support was provided by National Institutes of Health Grant 
P01HG00205.

1. Fleischmann, R. D., Adams, M. D., White, O., Clayton, R. A., Kirkness, E. F., et
al. (1995) Science 269, 496-512.

2. Fraser, C. M., Gocayne, J. D., White, O., Adams, M. D., Clayton, R. A., et al.
(1995) Science 270, 397-403.

3. Bult, C. J., White, O., Olsen, G. J., Zhou, L., Fleischmann, R. D., et al. (1996)
Science 273, 1058-1073.

4. Sulston, J., Du, Z., Thomas, K., Wilson, R., Hillier, L., et al. (1992) Nature
(London) 356, 37.

5. Newman, T., de Bruijn, F. J., Green, P., Keegstra, K., Kende, H., et al. (1994) Plant
Physiol. 106, 1241-1255.

6. Lashkari, D. A., Hunicke-Smith, S. P., Norgren, R. M., Davis, R. W. & Brennan,
T. (1995) Proc. Natl. Acad. Sci. USA 92, 7912-7915.

7. Schena, M., Shalon, D., Davis, R. W. & Brown, P. O. (1995) Science 270, 467-470.

8. Shalon, D., Smith, S. & Brown, P. O. (1996) Genome Res. 6, 639-645.

9. Heller, R. A., Schena, M., Chai, A., Shalon, D., Bedilion, T., Gilmore, J., Woolley,
D. E. & Davis, R. W. (1997) Proc. Natl. Acad. Sci. USA 94, 2150-2155.

10. DeRisi, J., Penland, L., Brown, P. O., Bittner, M. L., Meltzer, P. S., Ray, M., Chen,
Y., Su, Y. A. & Trent, J. M. (1996) Nat. Genet. 14, 457-460.

11. Nelson, S. F., McCusker, J. H., Sander, M. A., Kee, Y., Modrich, P. & Brown, P. O.
(1993) Nat. Genet. 4, 11-18.

12. Hoffman, C. S. & Winston, F. (1989) Gene 84, 473-479.

13. Schmitt, M., Brown, T. & Trumpower, B. (1990) Nucleic Acids Res. 18, 3091.

14. Ehrenhofer-Murray, A. E., Wurgler, F. E. & Sengstag, C. (1994) Mol. Gen. Genet.
244, 287-294.

15. Kim, K.-W., Kamerud, J. O., Livingston, D. M. & Roon, R. J. (1988) J. Biol. Chem.
263, 11948-11953.

16. Kim, K.-W. & Roon, R. J. (1984) J. Bacteriol. 157, 958-961.

17. Craig, E. A. (1992) in The Molecular Biology of the Yeast Saccharomyces: Gene
Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R. (Cold Spring Harbor
Lab. Press, Plainview, NY), Vol. 2, pp. 501-537.

18. Dang, V. D., Bohn, C., Bolotin-Fukuhara, M. & Daignan-Fornier, B. (1996) J.
Bacteriol. 178, 1842-1849.

19. O'Connell, K. F. & Baker, R. E. (1992) Genetics 132, 63-73.

20. Braus, G., Mosch, H. U., Vogel, K., Hinnen, A. & Hutter, R. (1989) EMBO J. 8,
939-945.

21. Mosch, H. U., Scheier, B., Lahti, R., Mantsala, P. & Braus, G. H. (1991) J. Biol.
Chem. 266, 20453-20456.

22. Mitchell, A. P. & Magasanik, B. (1984) Mol. Cell. Biol. 4, 2767-2773.

23. Daignan-Fornier, B. & Fink, G. R. (1992) Proc. Natl. Acad. Sci. USA 89,
6746-6750.

24. Tice-Baldwin, K., Fink, G. R. & Arndt, K. T. (1989) Science 246, 931-935.

25. Messenguy, F. & Dubois, E. (1993) Mol. Cell. Biol. 13, 2586-2592.

26. Devlin, C., Tice-Baldwin, K., Shore, D. & Arndt, K. T. (1991) Mol. Cell. Biol. 11,
3642-3651.

27. Magasanik, B. (1992) in The Molecular and Cellular Biology of the Yeast
Saccharomyces: Gene Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R. (Cold
Spring Harbor Lab. Press, Plainview, NY), Vol. 2, pp. 283-317.

28. Hinnebusch, A. G. (1992) in The Molecular and Cellular Biology of the Yeast
Saccharomyces: Gene Expression, eds. Jones, E. W., Pringle, J. R. & Broach, J. R.
(Cold Spring Harbor Lab. Press, Plainview, NY), Vol. 2, pp. 319-414.

29. Brisco, P. R. & Kohlhaw, G. B. (1990) J. Biol. Chem. 265, 11667-11675.

30. O'Connell, K. F., Surdin-Kerjan, Y. & Baker, R. E. (1995) Mol. Cell. Biol. 15,
1879-1888.

31. Arndt, K. T., Styles, C. & Fink, G. R. (1987) Science 237, 874-880.

32. Smith, V., Chou, K. N., Lashkari, D., Botstein, D. & Brown, P. O. (1996) Science
274, 2069-2074.

33. Shoemaker, D. D., Lashkari, D. A., Morris, D., Mittman, M. & Davis, R. W.
(1996) Nat. Genet. 14, 450-456.






Exhibit F of Rockett Declarati 
with Response dated 04/16/04 
In USSN: 09/828,423 







Exploring the Metabolic and Genetic Control of 
Gene Expression on a Genomic Scale 

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown* 

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used 
to carry out a comprehensive investigation of the temporal program of gene expression 
accompanying the metabolic shift from fermentation to respiration. The expression 
profiles observed for genes with known metabolic functions pointed to features of the 
metabolic reprogramming that occur during the diauxic shift, and the expression patterns 
of many previously uncharacterized genes provided clues to their possible functions. The 
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the
transcriptional activator YAP1. These results demonstrate the feasibility and utility of this
approach to genomewide exploration of gene expression patterns.



The complete sequences of nearly a dozen 
microbial genomes are known, and in the 
next several years we expect to know the 
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2),
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially
favorable organism in which to conduct a
systematic investigation of gene expression. 
The genes are easy to recognize in the ge- 
nome sequence, cis regulatory elements are 
generally compact and close to the tran- 
scription units, much is already known 
about its genetic regulatory mechanisms, 
and a powerful set of tools is available for its 
analysis. 

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac- 
terize the changes in gene expression that 
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of 
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by 



using a simple robotic printing device (9). 
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently
labeled cDNA was prepared by reverse transcription
in the presence of Cy3(green)- or
Cy5(red)-labeled deoxyuridine triphosphate
(dUTP) (11) and then hybridized to
the microarrays (12). To maximize the reliability
with which changes in expression
levels could be discerned, we labeled cDNA 
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable mea- 
sure of the relative abundance of the corre- 
sponding mRNA in the two cell popula- 
tions (Fig. 1). Data from the series of seven 
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements, 
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13).
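
The two-color design described above reduces each array element to a ratio of Cy5 (sample) to Cy3 (reference) fluorescence. The following is a minimal sketch of that calculation, assuming background-corrected, positive intensities; the function names and the default twofold cutoff are illustrative choices, not the authors' software.

```python
import math


def log2_ratio(cy5, cy3):
    """Return log2(Cy5/Cy3) for one array element (sample vs. reference)."""
    if cy5 <= 0 or cy3 <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5 / cy3)


def classify(cy5, cy3, fold=2.0):
    """Label a spot as induced, repressed, or unchanged relative to the reference.

    `fold` is an assumed cutoff; a spot is called only if the ratio
    reaches that factor in either direction.
    """
    lr = log2_ratio(cy5, cy3)
    threshold = math.log2(fold)
    if lr >= threshold:
        return "induced"
    if lr <= -threshold:
        return "repressed"
    return "unchanged"
```

Working in log2 units makes induction and repression symmetric around zero, so a fourfold increase and a fourfold decrease have the same magnitude.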

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these dif- 
ferences was only 2.7-fold (14). However, as 
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA 
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially ex- 
pressed genes have no currently recognized 
function and are not yet named. Indeed, 
more than 400 of the differentially expressed
genes have no apparent homology
to any gene whose function is known (15).
The responses of these previously uncharacterized
genes to the diauxic shift therefore
provide the first small clue to their possible
roles.
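
Counts such as those given above can be tallied from a table of expression ratios. The sketch below assumes that a gene is counted as induced if its ratio reaches the cutoff at any timepoint, and as repressed if it falls to the reciprocal of the cutoff at any timepoint. The text does not state the exact criterion, so this rule and the input layout are assumptions.

```python
def count_changed(ratios_by_gene, fold):
    """Count genes induced or repressed by at least `fold` at any timepoint.

    ratios_by_gene maps a gene name to its list of sample/reference
    expression ratios over the time course (hypothetical layout).
    Returns (number induced, number repressed).
    """
    induced = sum(1 for r in ratios_by_gene.values() if max(r) >= fold)
    repressed = sum(1 for r in ratios_by_gene.values() if min(r) <= 1.0 / fold)
    return induced, repressed
```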

The global view of changes in expres- 
sion of genes with known functions pro- 
vides a vivid picture of the way in which 
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system. 
We observed large inductions of the genes 
coding for the enzymes aldehyde dehydrogenase
(ALD2) and acetyl-coenzyme A (CoA)
synthase (ACS1), which function
together to convert the products of
alcohol dehydrogenase into acetyl-CoA, 
which in turn is used to fuel the tricarbox- 
ylic acid (TCA) cycle and the glyoxylate 
cycle. The concomitant shutdown of tran- 
scription of the genes encoding pyruvate 
decarboxylase and induction of pyruvate 
carboxylase rechannels pyruvate away 
from acetaldehyde, and instead to oxalac- 
etate, where it can serve to supply the 
TCA cycle and gluconeogenesis. Induction
of the pivotal genes PCK1, encoding
phosphoenolpyruvate carboxykinase, and
FBP1, encoding fructose 1,6-biphosphatase,
switches the directions of two key
irreversible steps in glycolysis, reversing
the flow of metabolites along the reversible
steps of the glycolytic pathway toward
the essential biosynthetic precursor,
glucose-6-phosphate. Induction of the genes
coding for the trehalose synthase and gly- 
cogen synthase complexes promotes chan- 
neling of glucose-6-phosphate into these 
carbohydrate storage pathways. 

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cy- 
cle and carbohydrate storage, were coordi- 
nately induced by glucose exhaustion. In 
contrast, genes devoted to protein synthe- 
sis, including ribosomal proteins, tRNA 
synthetases, and translation, elongation, 
and initiation factors, exhibited a coordi- 
nated decrease in expression. More than 
95% of ribosomal genes showed at least 
twofold decreases in expression during the 
diauxic shift (Fig. 4) (13). A noteworthy 
and illuminating exception was that the 



genes encoding mitochondrial ribosomal
proteins were generally induced rather than
repressed after glucose limitation, highlighting
the requirement for mitochondrial
biogenesis (13). As more is learned about
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA 
levels increasing by more than ninefold at 



the last timepoint but less than threefold at 
the preceding timepoint (Fig. 5B). All of 
these genes were known to be glucose-re- 
pressed, and five of the seven were previously 
noted to share a common upstream activat- 
ing sequence (UAS), the carbon source re- 
sponse element (CSRE) (16-20). A search 
in the promoter regions of the remaining two 
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the consensus
CSRE motif revealed only four additional
candidate genes, none of which
showed a similar induction. 
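
The late induction profile described above amounts to a simple filter on each gene's time series: more than ninefold at the last timepoint but less than threefold at the one before it. A minimal sketch follows, with a hypothetical function name and input layout; the thresholds are those quoted in the text.

```python
def late_induced(ratios_by_gene, last_min=9.0, prev_max=3.0):
    """Return genes induced > last_min at the final timepoint
    but < prev_max at the preceding timepoint.

    ratios_by_gene maps a gene name to its expression ratios in
    time order (hypothetical layout).
    """
    return sorted(
        gene
        for gene, r in ratios_by_gene.items()
        if len(r) >= 2 and r[-1] > last_min and r[-2] < prev_max
    )
```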

Examples from additional groups of
genes that shared expression profiles are
illustrated in Fig. 5, C through F. The
sequences upstream of the named genes in
Fig. 5C all contain stress response elements
(STRE), and with the exception
of HSP42, have previously been shown to
be controlled at least in part by these
elements (21-24). Inspection of the sequences
upstream of HSP42 and the two
uncharacterized genes shown in Fig. 5C,
YKL026c, a hypothetical protein with
similarity to glutathione peroxidase, and
YGR043c, a putative transaldolase, revealed
that each of these genes also possesses
repeated upstream copies of the stress-responsive
CCCCT motif. Of the 13 additional
genes in the yeast genome that
shared this expression profile [including
HSP30, ALD2, OM45, and 10 uncharacterized
ORFs (25)], nine contained one or
more recognizable STRE sites in their upstream
regions.

Fig. 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm. The
microarray was printed as described (9). This image was obtained with the same fluorescent
scanning confocal microscope used to collect all the data we report (49). A fluorescently labeled
cDNA probe was prepared from mRNA isolated from cells harvested shortly after inoculation (culture
density of <5 × 10^6 cells/ml and media glucose level of 19 g/liter) by reverse transcription in the
presence of Cy3-dUTP. Similarly, a second probe was prepared from mRNA isolated from cells taken
from the same culture 9.5 hours later (culture density of ~2 × 10^8 cells/ml, with a glucose level of
<0.2 g/liter) by reverse transcription in the presence of Cy5-dUTP. In this image, hybridization of the
Cy3-dUTP-labeled cDNA (that is, mRNA expression at the initial timepoint) is represented as a green
signal, and hybridization of Cy5-dUTP-labeled cDNA (that is, mRNA expression at 9.5 hours) is
represented as a red signal. Thus, genes induced or repressed after the diauxic shift appear in this
image as red and green spots, respectively. Genes expressed at roughly equal levels before and after
the diauxic shift appear in this image as yellow spots.

The heterotrimeric transcriptional acti- 
vator complex HAP2,3,4 has been shown 
to be responsible for induction of several 
genes important for respiration (26-28). 
This complex binds a degenerate consensus 
sequence known as the CCAAT box (26). 
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative
HAP2,3,4 binding site could be found in 
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding 
sites were present in all but one. Signifi- 
cantly, we found that transcription of 
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
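
A degenerate consensus such as TNRYTGGB can be searched for by expanding each IUPAC nucleotide code into a character class and scanning both strands. The sketch below shows one way to do this; it is not the program used for the analysis above, and the function names and coordinate convention are assumptions.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(consensus, seq):
    """Return sorted (position, strand) hits for a degenerate consensus.

    position is the 0-based start of the site on the forward strand.
    A lookahead is used so that overlapping sites are all reported.
    """
    seq = seq.upper()
    k = len(consensus)
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in consensus.upper()) + "))")
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [(len(seq) - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)
```

The same function applies to fixed motifs such as the stress-responsive CCCCT element, since plain bases expand to themselves.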

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UAS)
that is recognized by the Rap1 DNA-binding
protein (31, 32). The expression profiles
of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences
upstream of all seven genes revealed consensus
Rap1-binding motifs (33). It has
been suggested that declining Rap1 levels
in the cell during starvation may be responsible
for the decline in ribosomal protein
gene expression (34). Indeed, we observed
that the abundance of RAP1
mRNA diminished by 4.4-fold, at about
the time of glucose exhaustion.

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of
more than threefold at the diauxic shift.
SIP4 encodes a DNA-binding transcriptional
activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold induction
of SIP4 upon depletion of glucose
strongly suggests a role in the induction of 



downstream genes at the diauxic shift. 

Although most of the transcriptional 
responses that we observed were not pre- 
viously known, the responses of many 
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expres- 



sion ratios measured in these duplicate 
experiments differed by less than a factor 
of 2. However, in a few cases, there were 
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances 
(37, 38). Despite the noted exceptions, 
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
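
The two reproducibility measures quoted above, a correlation coefficient between duplicate hybridizations and the fraction of genes whose ratios agree within a factor of 2, can be computed as sketched below. The text does not say whether the correlation was calculated on raw or log-transformed ratios; log2 ratios are assumed here, and the function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_duplicates(ratios_a, ratios_b, fold=2.0):
    """Compare expression ratios from two duplicate hybridizations.

    Returns (correlation of log2 ratios, fraction of genes whose two
    ratios differ by less than `fold`).
    """
    la = [math.log2(r) for r in ratios_a]
    lb = [math.log2(r) for r in ratios_b]
    limit = math.log2(fold)
    within = sum(1 for a, b in zip(la, lb) if abs(a - b) < limit) / len(la)
    return pearson(la, lb), within
```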

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to
identify genes whose expression is affected
by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by
glucose requires the DNA-binding repressor
Mig1 and is mediated by recruiting the transcriptional
co-repressors Tup1 and Cyc8/Ssn6 (39).
Tup1 has also been implicated in
repression of oxygen-regulated, mating-type-specific,
and DNA-damage-inducible genes
(40).

Fig. 2. The section of the array indicated by the gray box in Fig. 1 is shown for each of
the experiments described here. Representative genes are labeled. In each of the arrays
used to analyze gene expression during the diauxic shift, red spots represent genes that
were induced relative to the initial timepoint, and green spots represent genes that were
repressed relative to the initial timepoint. In the arrays used to analyze the effects of the
tup1Δ mutation and YAP1 overexpression, red spots represent genes whose expression was
increased, and green spots represent genes whose expression was decreased by the genetic
modification. Note that distinct sets of genes are induced and repressed in the different
experiments. The complete images of each of these arrays can be viewed on the Internet (13).
Cell density as measured by optical density (OD) at 600 nm was used to measure the growth
of the culture.

Fig. 3. Metabolic reprogramming inferred from global analysis of changes in gene expression. Only key
metabolic intermediates are identified. The yeast genes encoding the enzymes that catalyze each step
in this metabolic circuit are identified by name in the boxes. The genes encoding succinyl-CoA synthase
and glycogen-debranching enzyme have not been explicitly identified, but the ORFs YGR244 and
YPR184 show significant homology to known succinyl-CoA synthase and glycogen-debranching
enzymes, respectively, and are therefore included in the corresponding steps in this figure. Red boxes with
white lettering identify genes whose expression increases in the diauxic shift. Green boxes with dark
green lettering identify genes whose expression diminishes in the diauxic shift. The magnitude of
induction or repression is indicated for these genes. For multimeric enzyme complexes, such as
succinate dehydrogenase, the indicated fold-induction represents an unweighted average of all the
genes listed in the box. Black and white boxes indicate no significant differential expression (less than
twofold). The direction of the arrows connecting reversible enzymatic steps indicates the direction of the
flow of metabolic intermediates, inferred from the gene expression pattern, after the diauxic shift. Arrows
representing steps catalyzed by genes whose expression was strongly induced are highlighted in red.
The broad gray arrows represent major increases in the flow of metabolites after the diauxic shift,
inferred from the indicated changes in gene expression.



Wild-type yeast cells and cells bearing
a deletion of the TUP1 gene (tup1Δ) were
grown in parallel cultures in rich medium
containing glucose as the carbon source.
Messenger RNA was isolated from exponentially
growing cells from the two populations
and used to prepare cDNA labeled
with Cy3 (green) and Cy5 (red),
respectively (11). The labeled probes were
mixed and simultaneously hybridized to
the microarray. Red spots on the microarray
therefore represented genes whose
transcription was induced in the tup1Δ
strain, and thus presumably repressed by
Tup1 (41). A representative section of the
microarray (Fig. 2, bottom middle panel)
illustrates that the genes whose expression
was affected by the tup1Δ mutation were,
in general, distinct from those induced
upon glucose exhaustion [complete images
of all the arrays shown in Fig. 2 are available
on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced
by a factor of at least 2 after the diauxic
shift were similarly induced by deletion of
TUP1, suggesting that these genes may be
subject to TUP1-mediated repression by
glucose. For example, SUC2, the gene encoding
invertase, and all five hexose transporter
genes that were induced during the
course of the diauxic shift were similarly
induced, in duplicate experiments, by the
deletion of TUP1.

The set of genes affected by Tup1 in this
experiment also included α-glucosidases,
the mating-type-specific genes MFA1 and
MFA2, and the DNA damage-inducible
RNR2 and RNR4, as well as genes involved
in flocculation and many genes of unknown
function. The hybridization signal corresponding
to expression of TUP1 itself was
also severely reduced because of the (incomplete)
deletion of the transcription unit
in the tup1Δ strain, providing a positive
control in the experiment (42).

Many of the transcriptional targets of
Tup1 fell into sets of genes with related
biochemical functions. For instance, although
only about 3% of all yeast genes
appeared to be TUP1-repressed by a factor
of more than 2 in duplicate experiments
under these conditions, 6 of the 13 genes
that have been implicated in flocculation
(15) showed a reproducible increase in
expression of at least twofold when TUP1
was deleted. Another group of related
genes that appeared to be subject to TUP1
repression encodes the serine-rich cell
wall mannoproteins, such as Tip1 and
Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar,
serine-poor proteins, the seripauperins
(44). Messenger RNA levels for 23 of the
26 genes in this group were reproducibly
elevated by at least 2.5-fold in the tup1Δ


www.sciencemag.org • SCIENCE • VOL. 278 • 24 OCTOBER 1997 






strain, and 18 of these genes were induced
by more than sevenfold when TUP1 was
deleted. In contrast, none of 83 genes that
could be classified as putative regulators of
the cell division cycle were induced more
than twofold by deletion of TUP1. Thus,
despite the diversity of the regulatory systems
that employ Tup1, most of the genes
that it regulates under these conditions
fall into a limited number of distinct functional
classes.
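
One way to judge whether a functional class is over-represented among the TUP1-repressed genes is a hypergeometric tail probability. No such test is reported in the text; the sketch below is an illustration only. The population size (about 6400 array elements) and the number of TUP1-repressed genes (about 3% of that, roughly 192) are rounded assumptions taken from figures quoted elsewhere in this report.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total genes on the array
    successes:  genes in the affected set (e.g. TUP1-repressed)
    draws:      genes in the functional class
    observed:   class members found in the affected set
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return hits / total


# Assumed round numbers: 6400 genes, 192 TUP1-repressed (about 3%),
# 13 flocculation genes, 6 of which were induced in the tup1 deletion.
p_flocculation = hypergeom_tail(6400, 192, 13, 6)
```

Under these assumptions the probability of seeing 6 or more flocculation genes by chance is very small, which is consistent with the clustering of Tup1 targets into functional classes described above.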

Because the microarray allows us to
monitor expression of nearly every gene in
yeast, we can, in principle, use this approach
to identify all the transcriptional
targets of a regulatory protein like Tup1. It
is important to note, however, that in any
single experiment of this kind we can only
recognize those target genes that are normally
repressed (or induced) under the
conditions of the experiment. For instance,
the experiment described here analyzed
a MATα strain in which MFA1
and MFA2, the genes encoding the a-factor
mating pheromone precursor, are
normally repressed. In the isogenic tup1Δ
strain, these genes were inappropriately
expressed, reflecting the role that Tup1
plays in their repression. Had we instead
carried out this experiment with a MATa
strain (in which expression of MFA1 and
MFA2 is not repressed), it would not have
been possible to conclude anything regarding
the role of Tup1 in the repression
of these genes. Conversely, we cannot distinguish
indirect effects of the chronic
absence of Tup1 in the mutant strain from
effects directly attributable to its participation
in repressing the transcription of a
gene.

Another simple route to modulating the
activity of a regulatory factor is to overexpress
the gene that encodes it. YAP1 encodes
a DNA-binding transcription factor
belonging to the b-zip class of DNA-binding
proteins. Overexpression of YAP1 in
yeast confers increased resistance to hydrogen
peroxide, o-phenanthroline, heavy
metals, and osmotic stress (45). We analyzed
differential gene expression between a
wild-type strain bearing a control plasmid
and a strain with a plasmid expressing YAP1
under the control of the strong GAL1-10
promoter, both grown in galactose (that is,
a condition that induces YAP1 overexpression).
Complementary DNA from the control
and YAP1 overexpressing strains, labeled
with Cy3 and Cy5, respectively, was
prepared from mRNA isolated from the two
strains and hybridized to the microarray.
Thus, red spots on the array represent genes
that were induced in the strain overexpressing
YAP1.

Of the 17 genes whose mRNA levels 
increased by more than threefold when 
YAP1 was overexpressed in this way, five 
bear homology to aryl-alcohol oxidoreduc- 
tases (Fig. 2 and Table 1). An additional 
four of the genes in this set also belong to 
the general class of dehydrogenases/oxi- 
doreductases. Very little is known about 
the role of aryl-alcohol oxidoreductases in 
S. cerevisiae, but these enzymes have been 
isolated from ligninolytic fungi, in which 
they participate in coupled redox reac- 
tions, oxidizing aromatic and aliphatic 
unsaturated alcohols to aldehydes with the 
production of hydrogen peroxide (46, 47). 
The fact that a remarkable fraction of the 
targets identified in this experiment be- 
long to the same small functional group of 
oxidoreductases suggests that these genes 
might play an important protective role 
during oxidative stress. Transcription of a 
small number of genes was reduced in the 
strain overexpressing Yap1. Interestingly, 
many of these genes encode sugar per- 
meases or enzymes involved in inositol 
metabolism. 
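The selection rule stated in the Table 1 caption (more than twofold induction in both duplicate experiments, and an average increase greater than threefold) can be sketched as follows. The function name, the input layout, and the example values are hypothetical and are not data from the paper.

```python
from statistics import mean


def select_induced(fold_changes, per_replicate_min=2.0, average_min=3.0):
    """Return {gene: average fold-increase} for genes passing both tests.

    fold_changes maps a gene name to a list of fold-increase values,
    one per replicate experiment (an assumed input layout).
    """
    selected = {}
    for gene, ratios in fold_changes.items():
        # Every replicate must exceed the per-replicate threshold.
        if all(r > per_replicate_min for r in ratios):
            avg = mean(ratios)
            # The average must also exceed the stricter threshold.
            if avg > average_min:
                selected[gene] = avg
    return selected


# Made-up numbers for illustration only.
example = {
    "GENE_A": [4.0, 5.0],  # passes both tests
    "GENE_B": [1.8, 6.0],  # fails: one replicate is not above twofold
    "GENE_C": [2.2, 2.4],  # fails: average is not above threefold
}
print(select_induced(example))
```

Requiring every replicate to pass before averaging keeps a single outlying measurement from placing a gene on the list.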

We searched for Yap1-binding sites 
(TTACTAA or TGACTAA) in the se- 
quences upstream of the target genes we 
identified (48). About two-thirds of the 
genes that were induced by more than 
threefold upon Yap1 overexpression had 
one or more binding sites within 600 bases 
upstream of the start codon (Table 1), sug- 
gesting that they are directly regulated by 
Yap1. The absence of canonical Yap1-bind- 
ing sites upstream of the others may reflect 
an ability of Yap1 to bind sites that differ 
from the canonical binding sites, perhaps in 
cooperation with other factors, or, less like- 
ly, may represent an indirect effect of Yap1 
overexpression, mediated by one or more 
intermediary factors. Yap1 sites were found 
only four times in the corresponding region 
of an arbitrary set of 30 genes that were not 
differentially regulated by Yap1. 

Fig. 4. Coordinated regulation of functionally related genes. The curves represent the average induction or repression ratios for all the genes in each indicated group. The total number of genes in each group was as follows: ribosomal proteins, 112; translation elongation and initiation factors, 25; tRNA synthetases (excluding mitochondrial synthetases), 17; glycogen and trehalose synthesis and degradation, 15; cytochrome c oxidase and reductase proteins, 19; and TCA- and glyoxylate-cycle enzymes, 24. 

Table 1. Genes induced by YAP1 overexpression. This list includes all the genes for which mRNA levels 
increased by more than twofold upon YAP1 overexpression in both of two duplicate experiments, and 
for which the average increase in mRNA level in the two experiments was greater than threefold (50). 
Positions of the canonical Yap1 binding sites upstream of the start codon, when present, and the 
average fold-increase in mRNA levels measured in the two experiments are indicated. 
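The search for canonical Yap1 sites within 600 bases upstream of a start codon can be sketched as below. The two motifs and the 600-base window come from the text; everything else is an assumption made for illustration, in particular the function names, the choice to scan both strands, and the convention of reporting the distance from the 5' end of the match on the given strand to the start codon. The example sequence is artificial.

```python
import re

YAP1_SITES = ("TTACTAA", "TGACTAA")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_yap1_sites(upstream, window=600):
    """Locate canonical Yap1 sites in the region preceding a start codon.

    upstream: sequence written 5'->3', ending immediately before the ATG.
    Returns a sorted list of (distance_upstream, strand, motif) tuples.
    """
    region = upstream.upper()[-window:]
    hits = []
    for motif in YAP1_SITES:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            # A lookahead lets overlapping occurrences be reported too.
            for match in re.finditer("(?=%s)" % pattern, region):
                hits.append((len(region) - match.start(), strand, motif))
    return sorted(hits)


# Artificial 74-base upstream sequence with one site on each strand.
upstream = "G" * 20 + "TTACTAA" + "C" * 30 + "TTAGTCA" + "A" * 10
print(find_yap1_sites(upstream))
```

The same function applied to a control set of unregulated genes would give the kind of background count the paragraph reports for its arbitrary set of 30 genes.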

Use of a DNA microarray to character- 
ize the transcriptional consequences of 
mutations affecting the activity of regula- 
tory molecules provides a simple and pow- 
erful approach to dissection and character- 
ization of regulatory pathways and net- 
works. This strategy also has an important 
practical application in drug screening. 
Mutations in specific genes encoding can- 
didate drug targets can serve as surrogates 
for the ideal chemical inhibitor or modu- 
lator of their activity. DNA microarrays 
can be used to define the resulting signa- 
ture pattern of alterations in gene expres- 
sion, and then subsequently used in an 
assay to screen for compounds that repro- 
duce the desired signature pattern. 

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any 
other organism are minor. The equipment 
required for fabricating and using DNA 
microarrays (9) consists of components 
that were chosen for their modest cost and 
simplicity. It was feasible for a small group 
to accomplish the amplification of more 
than 6000 genes in about 4 months and, 
once the amplified gene sequences were in 
hand, only 2 days were required to print a 
set of 110 microarrays of 6400 elements 
each. Probe preparation, hybridization, 
and fluorescent imaging are also simple 
procedures. Even conceptually simple ex- 
periments, as we described here, can yield 
vast amounts of information. The value of 
the information from each experiment of 
this kind will progressively increase as 
more is learned about the functions of 
each gene and as additional experiments 
define the global changes in gene expres- 
sion in diverse other natural processes and 
genetic perturbations. Perhaps the greatest 
challenge now is to develop efficient 
methods for organizing, distributing, inter- 
preting, and extracting insights from the 
large volumes of data these experiments 
will provide. 
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Fig. 5. Distinct temporal patterns of induction or repression help to group genes that share regulatory 
properties. (A) Temporal profile of the cell density, as measured by OD at 600 nm and glucose 
concentration in the media. (B) Seven genes exhibited a strong induction (greater than ninefold) only at 
the last timepoint (20.5 hours). With the exception of IDP2, each of these genes has a CSRE UAS. There 
were no additional genes observed to match this profile. (C) Seven members of a class of genes marked 
by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif 
repeats in its upstream promoter region. (D) Cytochrome c oxidase and ubiquinol cytochrome c 
reductase genes. Marked by an induction coincident with the diauxic shift, each of these genes contains 
a consensus binding motif for the HAP2,3,4 protein complex. At least 17 genes shared a similar 
expression profile. (E) SAM1, GPP1, and several genes of unknown function are repressed before the 
diauxic shift, and continue to be repressed upon entry into stationary phase. (F) Ribosomal protein 
genes comprise a large class of genes that are repressed upon depletion of glucose. Each of the genes 
profiled here contains one or more RAP1 -binding motifs upstream of its promoter. RAP1 is a transcrip- 
tional regulator of most ribosomal proteins. 
tion, the bound DNA was denatured by a 2-min in- 
cubation in distilled water at ~95°C. The slides were 
then transferred into a bath of 100% ethanol at room 
temperature, rinsed, and then spun dry in a clinical 
centrifuge. Slides were stored in a closed box at 
room temperature until used. 
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vessel, was inoculated with 2 ml of a fresh over- 
night culture of yeast strain DBY7286 (MATa, ura3, 
GAL2). The fermentor was maintained at 30°C with 
constant agitation and aeration. The glucose con- 
tent of the media was measured with a UV test kit 
(Boehringer Mannheim, catalog number 716251). 
Cell density was measured by OD at 600-nm wave- 
length. Aliquots of culture were rapidly withdrawn 
from the fermentation vessel by peristaltic pump, 
spun down at room temperature, and then flash 
frozen with liquid nitrogen. Frozen cells were stored 
at -80°C. 

11. Cy3-dUTP or Cy5-dUTP (Amersham) was incorpo- 
rated during reverse transcription of 1.25 µg of 
polyadenylated [poly(A)+] RNA, primed by a dT(16) 
oligomer. This mixture was heated to 70°C for 10 
min, and then transferred to ice. A premixed solu- 
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buffer, deoxyribonucleoside triphosphates, and flu- 
orescent nucleotides, was added to the RNA. Nu- 
cleotides were used at these final concentrations: 
600 µM for dATP, dCTP, and dGTP and 200 µM 
for dTTP. Cy3-dUTP and Cy5-dUTP were used at 
a final concentration of 100 µM. The reaction was 
then incubated at 42°C for 2 hours. Unincorporat- 
ed fluorescent nucleotides were removed by first 
diluting the reaction mixture with 470 µl of 10 
mM tris-HCl (pH 8.0)/1 mM EDTA and then subse- 
quently concentrating the mix to ~5 µl, using Cen- 
tricon-30 microconcentrators (Amicon). 

12. Purified, labeled cDNA was resuspended in 11 µl of 
3.5× SSC containing 10 µg poly(dA) and 0.3 µl of 
10% SDS. Before hybridization, the solution was 
boiled for 2 min and then allowed to cool to room 
temperature. The solution was applied to the mi- 
croarray under a cover slip, and the slide was 
placed in a custom hybridization chamber, which 
was subsequently incubated for ~8 to 12 hours in 
a water bath at 62°C. Before scanning, slides were 
washed in 2× SSC, 0.2% SDS for 5 min, and then 
0.05× SSC for 1 min. Slides were dried before 
scanning by centrifugation at 500 rpm in a Beck- 
man CS-6R centrifuge. 

13. The complete data set is available on the Internet at 
cmgm.stanford.edu/pbrown/explore/index.html 

14. For 95% of all the genes analyzed, the mRNA levels 
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than 1.5. The correlation coefficient for the compar- 
ison between mRNA levels measured for each gene 
in these two different mRNA samples was 0.98. 
When duplicate mRNA preparations from the same 
cell sample were compared in the same way, the 
correlation coefficient between the expression levels 
measured for the two samples by comparative hy- 
bridization was 0.99. 

15. The numbers and identities of known and putative 
genes, and their homologies to other genes, were 
gathered from the following public databases: Sac- 
charomyces Genome Database (genome-www. 
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not directly involved in carbon metabolism but 
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BUSINESS/FINANCIAL DESK 

Human Genome Placed on Chip; Biotech Rivals Put It Up 
for Sale 

By ANDREW POLLACK (NYT) 1030 words 

The genome on a chip has arrived. 

Melding high technology with biology, several companies are rushing to sell slivers of glass or 
nylon, some as small as postage stamps, packed with pieces of all 30,000 or so known human 
genes. 

The new products will allow scientists to scan all genes in a human tissue sample at once, to 
determine which genes are active, a job that previously required two or more chips. The whole- 
genome chips will lower the cost and increase the speed of a widely used test that has 
transformed biomedical research in the last few years. 

"It's sort of a milestone event, very similar to generating an integrated circuit of the genome," 
said Stephen P. A. Fodor, the chief executive of Affymetrix Inc., the leading seller of gene chips, 
which are also called microarrays. 

Affymetrix, based in Santa Clara, Calif., is expected to announce today that it is accepting orders 
for its whole-genome chip. 

The announcement seems timed to steal some thunder from the rival Agilent Technologies, 
which is based in nearby Palo Alto. Agilent is to be the host of an analyst meeting today and it 
plans to announce then that it has started shipping test versions of its whole-genome chip. 

Applied Biosystems of Foster City, Calif., a unit of the Applera Corporation, started the race in 
July with an announcement that it would have a whole-genome chip out by the end of this year. 
NimbleGen Systems, a small company in Madison, Wis., announced a few days later that it had a 
genome on a chip that it was not selling but that it was using to run tests for customers. 

Gene chips, which detect genes that are active, meaning they are being used to make a protein, 
have become essential tools. Scientists try to understand the genetic mechanisms of disease by 
seeing which genes are turned on in, say, a sick kidney or lung compared with those active in a 
healthy organ. Pharmaceutical companies look at gene activity patterns to try to predict the 
effects of drugs. 



Scientists have found that tumors that look the same under the microscope can differ in terms of 
which genes are active. So studying gene patterns could become a way to discriminate between 
deadly and not-so-deadly tumors, or to predict which drug will work best for a particular patient. 

Still, even some vendors conceded that the change from two chips to one is more symbolic than 
revolutionary. 

"You can do just as good science with two chips, it costs you a little more," said Roland Green, 
the vice president for research and development at NimbleGen. 

Some scientists questioned whether the chips really have all human genes, because the exact 
number and identities of all the genes is not known. 

The advent of the genome on a chip is, however, evidence that biotechnology, to the extent that it 
uses electronics, is experiencing some of the rapid progress that has made semiconductors and 
computers continuously cheaper and smaller. 

"One of the effects everyone is looking for in the genomics area is Moore's law — more data, less 
money," said Doug Dolginow, an executive vice president at Gene Logic, which sells data from 
gene chip studies to pharmaceutical companies. "This is a step in that direction." 

Moore's law states that the number of transistors on a semiconductor chip doubles every 18 
months. 

Affymetrix's gene chips are, in fact, made with the same techniques used to make semiconductor 
chips. In the mid-1990's, the company came out with a set of five chips covering what was then 
known of the human genome. After the human genome sequence was virtually completed in 
2000, the company developed a two-chip set with all the known genes. Now it has the single 
chip, which some scientists say will be more convenient. 

"We like to be able to look at all genes at one time to get a global view of what's going on," said 
John R. Walker, who runs gene chip operations at the Genomics Institute of the Novartis 
Research Foundation in San Diego. 

Costs should also be lower. Gene chips have been so expensive that many academic scientists 
still make their own rather than buy them. Affymetrix said it would sell its whole-genome chips 
for $300 to $500 each, depending on volume, little more than half the price of the two-chip set. 
The other companies have not announced prices. 

For Affymetrix, a successful whole-genome chip "is essential for them to maintain their 
dominance" of high-end microarrays, said Edward A. Tenthoff, an analyst at U.S. Bancorp Piper 
Jaffray. Affymetrix had total product sales in 2002 of about $250 million, and a company 
spokesman said that human genome chips are its top-selling product. 

Mr. Tenthoff, who recommends Affymetrix stock, said the company's sales growth rate had 
moderated as it faces tougher competition. Agilent, a spinoff of Hewlett-Packard that makes its 
gene chips by printing DNA components onto glass slides using ink jet printers, has gained 
share, he said. Applied Biosystems, the largest maker of genomics equipment over all, will be 
entering the microarray segment of the business with its whole-genome chip, emphasizing the 
connection of that product to the others it offers, including the gene database developed by its 
sister company, Celera Genomics. 

Jeffrey Trent, scientific director of the Translational Genomics Research Institute in Phoenix, 
said that while whole-genome chips are useful for medical discovery, the biggest growth of the 
market will be for chips that can be used by doctors to do diagnoses. And whole-genome chips 
are too cumbersome for that, he said. Rather, once scientists use the whole-genome chips to find 
particular genes that are associated with, say, tumor aggressiveness or drug effectiveness, he 
said, they will then make smaller and cheaper chips containing just those genes for use in 
diagnosis. 
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Agilent Technologies ships whole human genome on single 
microarray to gene expression customers for evaluation 

Company to introduce first commercial whole human microarray by end of year 
PALO ALTO, Calif., Oct. 2, 2003 
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Agilent Technologies Inc. (NYSE: A) today announced it has shipped whole human-genome microarrays 
to customers for testing and evaluation. The whole genome microarray is based on Agilent's new double- 
density format, which can accommodate 44,000 features on a single 1"x3" glass-slide microarray. The 
new platform enables drug-discovery and disease researchers to perform whole-genome screening at a 
lower cost and with higher reproducibility. 

"This is an important step toward our release of the first whole human-genome microarray product, which 
is expected to be available for order before the end of the year," said Barney Saunders, vice president 
and general manager of Agilent's BioResearch Solutions Unit. "Customers have long wanted a one- 
sample, one-chip format with the increased sensitivity associated with 60-mer probes. The cost savings 
and high-quality performance make this product a compelling alternative for scientists who make their 
own microarrays." 

Agilent's microarrays are based on the industry-standard 1" x 3" (25mm x 75mm) format, which is 
compatible with most commercial microarray scanners. All Agilent commercial microarrays are developed 
using content from public databases and proprietary sources, with full sequence and annotation 
information made available to customers. Gene sequences for probes are developed using algorithms 
and then validated empirically through iterative wet-lab testing procedures. The result is a microarray 
comprised of functionally validated probes, with the most up-to-date and comprehensive genome 
information commercially available. 
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Advantages of the double-density format include: 



• Lower cost. Not only is one microarray less expensive than two, it requires fewer reagents and 
reduces instrumentation demands. 

• Streamlined workflow. Researchers need prepare and process only one microarray instead of 
two. This also results in fewer steps in the subsequent data analysis. 

• Greater reproducibility. Use of a single microarray further reduces unnecessary variability in 
experimental conditions. 

• Smaller sample use. A smaller quantity of sample material is required to perform an experiment. 



Availability 



Agilent's Whole Human Genome Microarray is expected to be available for order by the end of the year. 



About Agilent Technologies 



Agilent Technologies Inc. (NYSE: A) is a global technology leader in communications, electronics, life 
sciences and chemical analysis. The company's 30,000 employees serve customers in more than 110 
countries. Agilent had net revenue of $6 billion in fiscal year 2002. Information about Agilent is available 
on the Web at www.agilent.com. 
Forward-Looking Statements 

This news release contains forward-looking statements (including, without limitation, statements relating 
to Agilent's expectation that its whole-genome microarray platform will be available for order before the 
end of 2003) that involve risks and uncertainties that could cause results to differ materially from 
management's current expectations. These and other risks are detailed in the company's filings with the 
Securities and Exchange Commission, including its Annual Report on Form 10-K for the year ended Oct. 
31, 2002, its Quarterly Report on Form 10-Q for the quarter ended July 31, 2003 and its Current Report 
on Form 8-K filed Aug. 18, 2003. The company assumes no obligation to update the information in this 
press release. 
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Affymetrix Announces Commercial Launch of Single Array for Human Genome 
Expression Analysis 




Affymetrix GeneChip(R) Brand Human Genome U133 Plus 2.0 Array. 
(PRNewsFoto)[ASJ 






SANTA CLARA, CA USA 10/02/2003 




More Than 1 Million Probes Analyze Expression Levels of Nearly 50,000 RNA 
Transcripts and Variants on a Single Array the Size of a Thumbnail 

SANTA CLARA, Calif., Oct. 2 /PRNewswire/ Affymetrix, Inc., 
(Nasdaq: AFFX) announced today that it is taking orders for its new 
GeneChip(R) brand Human Genome U133 Plus 2.0 Array, offering researchers the 
protein-coding content of the human genome on a single commercially available 
catalog microarray. The HG-U133 Plus 2.0 Array analyzes the expression level 
of nearly 50,000 RNA transcripts and variants with 22 different probes per 
transcript, providing superior data quality unmatched by technologies using a 
single probe per transcript. 

(Photo: http://www.newscom.com/cgi-bin/prnh/20031002/SFTH021) 

"With about 1.3 million probes on a chip the size of a human thumbnail, 
the Human Plus Array represents a leap in array technology data capacity, and 
further demonstrates the unique power and potential of our technology to 
explore vast areas of the genome," said Trevor J. Nicholls, Ph.D., Chief 
Commercial Officer. "Multiple independent measurements for each transcript 
ensure that our data quality remains the industry standard, even as our data 
capacity increases dramatically." 

The HG-U133 Plus 2.0 Array, which will ship in October, combines the 
content of the previous HG-U133 two-array set with nearly 10,000 new probe 
sets representing about 6,500 new genes, for a total of nearly 50,000 RNA 
transcripts and variants. This new information, verified against the latest 
version of the publicly available genome map, provides researchers the most 
comprehensive and up-to-date genome-wide gene expression analysis. The probe 
design strategy of the HG-U133 Plus 2.0 Array is identical to the previous HG- 
U133 Set, providing very strong data concordance between the two products. 
With more than double the data capacity of the previous-generation Affymetrix 
human product, the HG-U133 Plus 2.0 Array can significantly cut processing and 
analysis time for scientists in the lab, freeing up valuable resources and 
accelerating research. 

The HG-U133 Plus 2.0 Array sets a new standard for the number of genes and 
transcripts on any commercially available single array for human gene 
expression analysis, while maintaining Affymetrix' unrivaled data quality. The 
HG-U133 Plus 2.0 Array uses 22 independent measures to detect the 
hybridization of each transcript on the array, 1.3 million data points in all, 
more than 30 times that of any other microarray technology. Using multiple, 
independent measurements provides optimal sensitivity and specificity, and the 
most accurate, consistent and statistically significant results possible. 

"More data points produce more reliable results and ultimately, enable 
better science," said Nicholls. "Our powerful probe set strategy gives our 
customers the assurance that their array results actually reflect what's in 
their sample." 

Affymetrix is also launching an updated 11-micron version of its popular 
18-micron HG-U133A Array called the GeneChip HG-U133A 2.0 Array. The reduced 
feature size on this new design means researchers can use smaller sample 
volumes than on the previous 18-micron array without compromising performance. 
This new array represents over 20,000 transcripts that can be used to explore 
human biology and disease processes. All probe sets represented on the 
original GeneChip HG-U133A Array are identically replicated on the GeneChip 
HG-U133A 2.0 Array. 

More information on the design of the HG-U133 Plus 2.0 Array and the 
HG-U133A 2.0 Array may be found on the Affymetrix website at 

http://www.affymetrix.com . 

Affymetrix will be presenting further information on this and other 
products at the BioTechnica trade show in Hanover, Germany on Oct. 7-9, 2003. 
The Company will also hold a press conference on Oct. 7, from 11 a.m. to 
12 p.m. at the show regarding the new Human Genome U133 Plus 2.0 Array. If you 
would like to attend this press conference, please contact Caroline Stupnicka 
at c.stupnicka@northbankcommunications.com . 

About Affymetrix: 

Affymetrix is a pioneer in creating breakthrough tools that are driving 
the genomic revolution. By applying the principles of semiconductor technology 
to the life sciences, Affymetrix develops and commercializes systems that 
enable scientists to improve the quality of life. The Company's customers 
include pharmaceutical, biotechnology, agrichemical, diagnostics and consumer 
products companies as well as academic, government and other non-profit 
research institutes. Affymetrix offers an expanding portfolio of integrated 
products and services, including its integrated GeneChip platform, to address 
growing markets focused on understanding the relationship between genes and 
human health. Additional information on Affymetrix can be found at 
http://www.affymetrix.com . 

All statements in this press release that are not historical are 
"forward-looking statements" within the meaning of Section 21E of the 
Securities Exchange Act as amended, including statements regarding Affymetrix' 
"expectations," "beliefs," "hopes," "intentions," "strategies" or the like. 
Such statements are subject to risks and uncertainties that could cause actual 
results to differ materially for Affymetrix from those projected, including, 
but not limited to risks of the Company's ability to achieve and sustain 
higher levels of revenue, higher gross margins, reduced operating expenses, 
uncertainties relating to technological approaches, manufacturing, product 
development, market acceptance (including uncertainties relating to product 
development and market acceptance of the GeneChip HG-U133 Human Plus 2.0 Array 
and the HG-U133A 2.0), personnel retention, uncertainties related to cost and 
pricing of Affymetrix products, dependence on collaborative partners, 
uncertainties relating to sole source suppliers, uncertainties relating to FDA 
and other regulatory approvals, competition, risks relating to intellectual 
property of others and the uncertainties of patent protection and litigation. 
These and other risk factors are discussed in Affymetrix' Form 10-K for the 
year ended December 31, 2002 and other SEC reports, including its Quarterly 
Reports on Form 10-Q for subsequent quarterly periods. Affymetrix expressly 
disclaims any obligation or undertaking to release publicly any updates or 
revisions to any forward-looking statements contained herein to reflect any 
change in Affymetrix' expectations with regard thereto or any change in 
events, conditions, or circumstances on which any such statements are based. 

NOTE: Affymetrix, the Affymetrix logo, and GeneChip are registered 
trademarks owned or used by Affymetrix, Inc. 



SOURCE Affymetrix, Inc. 

Web Site: http://www.affymetrix.com 
Photo Notes: NewsCom: 

http://www.newscom.com/cgi-bin/prnh/20031002/SFTH021 AP Archive: 
http://photoarchive.ap.org PRN Photo Desk, 

photodesk@prnewswire.com 

Issuers of news releases and not PR Newswire are solely responsible for the accuracy of the 
content. 
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Macroresults through Microarrays 

John C. Rockett, Reproductive Toxicology Division (MD-72), National Health and Environmental Effects Research Laboratory, Office of Research 
and Development, US Environmental Protection Agency, Research Triangle Park, 2525 East Highway 54, Durham, NC 27711, USA; 
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The third enactment of Cambridge 
Healthtech Institute's Macroresults 
through Microarrays meeting was held 
in Boston (MA, USA) from 29 April- 
1 May 2002. The subtheme of this year's 
meeting was 'advancing drug discov- 
ery', a widely touted application for 
array technology. 

The evolution of microarrays 
If you were asked 'Who first conceived 
of the idea of microarrays', who would 
come to mind? Mark Schena perhaps, 
first author of the seminal 1995 paper 
on cDNA arrays [1]? Maybe Pat Brown, 
Schena's then supervisor? Or perhaps 
Stephen Fodor, the primary driver 
behind Affymetrix's (http://www. 
affymetrix.com) oligonucleotide-based 
platform [2]. Brits might even chant the 
name of Ed Southern [3]. Well, accord- 
ing to Roger Ekins (University College 
London Medical School; http://www. 
ucl.ac.uk/medicine/) all these answers 
would be wrong. It was in fact Ekins 
and his colleagues who first conceived 
of and patented 'a new generation of 
ultrasensitive, miniaturized assays for 
protein and DNA-RNA measurement 
based on the use of microarrays' in the 
mid-1980s [4]. The concept and poten- 
tial of array technology were more fully 
described in a later publication, in 
which Ekins et al. [5] concluded that an- 
tibody microspots of ~50 µm² could be 
achieved, and that as many as 2 million 
different immunoassays could, in prin- 
ciple, be accommodated on a surface 
area of 1 cm². 
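The two figures quoted from Ekins are consistent with each other if the spot size is read as an area of about 50 µm² and the spots are assumed to tile the surface with no gaps between them. A short check, with a hypothetical function name:

```python
# 1 cm = 10,000 µm, so 1 cm² = 10,000² µm².
UM2_PER_CM2 = 10_000 ** 2


def max_spots_per_cm2(spot_area_um2):
    """Upper bound on spots per cm², assuming gap-free tiling."""
    return UM2_PER_CM2 / spot_area_um2


print(max_spots_per_cm2(50))  # 2000000.0
```

Any real layout needs spacing between spots, so this is a ceiling rather than an achievable density.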

Technological innovation 

In practice, it took a different biological 
molecule (DNA), a different research 
group, and a leap into microfabri- 
cation technology to even begin 
approaching these kinds of densities 
[Affymetrix patent 6045996 talks of 
one million spots cm⁻²]. Of course, 
advancing technology is one of the 
driving engines behind the genomics 
juggernaut, and we are already seeing 
'4th generation' machines for fab- 
ricating DNA chips. If the company 
representatives at this meeting are to 
be believed (and their cases seemed 
strong), spotting is out, and in situ 
fabrication of oligonucleotide-based 
'iterative custom arrays' is in. Whether 
you go with Combimatrix's (http:// 
www.combimatrix.com) electrochemi- 
cally directed synthesis and detection 
system, febit's (http://www.febit.com) 
Geniom® technology, or Nimblegen's 
(http://www.nimblegen.com) Maskless 
Array Synthesizer technology is a 
matter of personal choice. However, 
each of these machines provides the 
flexibility to design variable length 
oligonucleotide probes from se- 
quences inputted by the user, and then 
perform in situ synthesis of an array. 
Each system also boasts unique advan- 
tages. For example, Combimatrix's 
biological array processor is a semi- 
conductor coated with a 3D layer 
of porous material in which DNA, 
RNA, peptides or small molecules 
can be synthesized or immobilized 
within discrete test sites, while febit's 
Geniom One® is a fully integrated 
gene-expression analysis system with 
minimal user hands-on time - the 
probe sequences are programmed, the 
RNA samples inserted, and the gene 
expression data is pumped out a few 
hours later. 



Cell- and tissue-based arrays 
Array technology is in most people's 
minds firmly linked with gene-expression 
profiling. Fewer are aware that cell- and 
tissue-based arrays have been devel- 
oped, and how they can provide 
a vital extra dimension to research. In 
support of this, Barry Bochner gave an 
update on the cell-based array system 
that Biolog (http://www.biolog.com) 
has produced for simultaneously mea- 
suring the effects of one gene in the cell 
under thousands of growth conditions 
(see [6] for further details). David Walt 
(Tufts University; http://www.tufts. 
edu/) is developing single live cell ar- 
rays using optical imaging fiber (OIF) 
technology. An array of microwells is 
fabricated on the face of an OIF at den- 
sities of up to 10 million wells cm⁻². 
Cells are then added to the wells and 
disperse at an average of one cell per 
well. Physiological and genetic re- 
sponses of each cell are measured via 
fluorescence produced by reporter 
genes (e.g. lacZ, gfp). Assays performed 
so far include yeast live or dead cell 
assay, microenvironment pH and 
O₂ measurements, promoter responses 
using the lacZ and phoA reporter genes, 
and protein-protein interactions using 
the yeast two-hybrid system. The main 
advantage of this system is that the cells 
remain alive during the assay, which 
means a real-time timecourse can be 
performed and/or the array passed 
from sample to sample. This would be 
useful in, for example, the scanning of 
a combinatorial drug library for specific 
physiological effects. 
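
As a rough illustration of what "an average of one cell per well" implies, the sketch below assumes that cells settle into wells independently and at random, so that the number of cells per well follows a Poisson distribution. This loading model is an assumption made here for illustration and was not stated in the talk; the function names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def occupancy_fractions(mean_cells_per_well):
    """Expected fractions of empty, single-cell and multi-cell wells.

    Assumes random, independent loading (Poisson); real loading may differ.
    """
    empty = poisson_pmf(0, mean_cells_per_well)
    single = poisson_pmf(1, mean_cells_per_well)
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}

if __name__ == "__main__":
    for label, fraction in occupancy_fractions(1.0).items():
        print(f"{label}: {fraction:.3f}")
    # empty: 0.368
    # single: 0.368
    # multiple: 0.264
```

Under this assumption, only about a third of the wells would hold exactly one cell at a mean loading of one cell per well, so any per-cell readout would need to identify and exclude empty and multiply occupied wells.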

Tissue arrays are a useful complemen- 
tary technology to DNA arrays because 
they can be used to help validate and 






1359-6446/02/$ - see front matter ©2002 Elsevier Science Ltd. All rights reserved. PII: S1359-6446(02)02352-8



DDT Vol. 7, No. 15 August 2002 






understand the biological and medical 
significance of gene changes discov- 
ered using standard DNA arrays. For 
example, an array of tumor tissues can 
be screened for the protein (using im- 
munohistochemistry), message (using 
in situ hybridization) and copy number 
(using comparative genomic hybridiza- 
tion) of a gene of interest, to determine 
if expression of the gene (or lack 
thereof) is related in any way to sur- 
vival. They can also be used to predict 
the probability of clinical failure of lead 
compounds as a result of toxicity by 
evaluating the distribution of the drug 
targets in normal tissue. Spyro Mousses 
and his co-workers at the National 
Human Genome Research Institute 
(http://www.nhgri.nih.gov/index.html) 
have built such arrays, including a 
multi-tumor array (~5000 specimens,
and sections from 36 normal and 800 
metastatic tissues) and a normal tissue 
array (76 tissue and 332 cell types). 

The problem with proteins 
It has been said that genomics tells us 
what might happen, transcriptomics 
indicates what should happen, and pro- 
teomics shows what is happening. The 
impact of functional proteomics on 
pharmaceutical R&D is rapidly increas- 
ing, and protein arrays are being used 
increasingly in both basic and applied 
research. Their use lies not only in com- 
parative protein expression and inter- 
action profiling, but also in diagnostics 
and drug discovery. However, an in- 
creasing number of researchers have 
found that protein arrays, like their 
cousins the DNA arrays, present several 
practical obstacles relating to their pro- 
duction and use. For example, in using 
Escherichia coli to produce recombi- 
nant eukaryotic proteins from a single 
expression vector, multiple protein 
products are often produced, suggest- 
ing mixes of truncated or otherwise 
altered proteins. There is also the obvi- 
ous concern that the proteins might 
not be modified in a similar manner to
eukaryotic systems. Also, an optimal
method for depositing and binding 
proteins to the selected substrate is 
yet to be determined, as is the best 
way to ensure that they are bound in a 
correctly folded, active conformation. 

Several companies have been address- 
ing these problems. Prolinx (http:// 
www.prolinxinc.com) is one such com- 
pany, and Karin Hughes described their 
Versalinx™ chemistry for producing 
protein, peptide and small-molecule 
arrays. Versalinx™ uses solution-phase 
conjugation followed by immobiliza- 
tion, resulting in functional orientation 
of proteins and peptides on the sub- 
strate surface. It also offers the valuable 
additional benefit of exhibiting low 
non-specific binding. Sense Proteomic 
(http://www.senseproteomic.com) is 
also among those addressing these 
problems to develop robust protein 
arrays for drug discovery and clinical 
applications and has developed func- 
tional protein array formats based on 
specific disease tissues. Subtractive hy- 
bridization is used to identify genes 
with altered expression in breast tumor 
and cystic fibrosis compared to normal 
tissue. A high throughput cloning strat- 
egy (COVET™) is then used to produce 
libraries of genes that are tagged, 
cloned, expressed, purified and finally 
immobilized on glass slides. Initial vali- 
dation studies have shown that the vast 
majority of the immobilized proteins do 
indeed retain biological function. 

Stefan Schmidt and his company 
(GPC Biotech; http://www.gpcbiotech.de)
have moved past the platform development
stage and, with their focus
firmly on drug discovery, are currently 
developing kinase-profiling arrays. 
Kinases are important targets for phar- 
maceutical drug discovery and therapy, 
and GPC's aim is to simultaneously de- 
tect multiple kinases, obtain activity pro- 
files for different cell types, or analyze 
the ability of drug candidates to inhibit 
kinase activity. To do this, recombinant 
kinase substrates are immobilized on
membranes, incubated with purified
kinase, and the substrates measured for
the degree of phosphorylation.

Summary 

Meetings like this, packed with exciting
discoveries and intriguing innovations,
underline the pace at which
biotechnology is advancing, to the
extent that the number of options
for genomic and proteomic researchers
can become overwhelming.
Although data analysis is perhaps the 
greatest current concern for array users, 
an increasing challenge will be to deter- 
mine the approaches and technology 
that really work, and to do it in a timely 
manner. 